CPSC 330 Lecture 23: Deployment and conclusion

Varada Kolhatkar

Announcements

  • Last lecture today 🥺!
  • HW9 is due Friday, Dec 5th at 11:59 PM. (No late submissions allowed.)
  • My OH next week has been moved to 11 AM. Do you prefer in-person or Zoom OH?
  • For an in-person OH I’ll book a larger room.

❓❓ Questions for you

Imagine you’ve created a machine learning model and are eager to share it with others. Consider the following scenarios for sharing your model:

  • To a non-technical audience: How would you present your model to friends and family who may not have a technical background?
  • To a technical audience: How would you share your model with peers or professionals in the field who have a technical understanding of machine learning?
  • In an academic or research setting: How would you disseminate your model within academic or research communities?

Try out this moment predictor

https://cpsc330-moment-predictor.onrender.com/

  • In this lecture, I will show you how to set up/develop this.

What is deployment?

  • After we train a model, we want to use it!
  • The user likely does not want to install your Python stack or train your model themselves.
  • You don’t necessarily want to share the dataset.
  • So we need to do two things:
    1. Save/store your model for later use.
    2. Make the saved model conveniently accessible.
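
The two steps above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the class demo: the "model" is a hypothetical stand-in (a dict holding one learned parameter), `pickle` is assumed as the storage format, and `fit_mean_model`, `save_model`, `load_model`, and `handle_request` are illustrative names. A fitted scikit-learn estimator can be pickled in the same way.

```python
import json
import pickle
import tempfile
from pathlib import Path


def fit_mean_model(y):
    """Hypothetical stand-in for training: learn the mean of the targets."""
    return {"kind": "mean_baseline", "mean": sum(y) / len(y)}


def predict(model, rows):
    """Predict the learned mean for every input row."""
    return [model["mean"] for _ in rows]


def save_model(model, path):
    """Step 1: store the trained model on disk for later use."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Reload the stored model; the training data is not needed here."""
    with open(path, "rb") as f:
        return pickle.load(f)


def handle_request(model, request_body):
    """Step 2: turn a JSON request into a JSON response.

    A web app would call a function like this from a route handler.
    """
    rows = json.loads(request_body)["rows"]
    return json.dumps({"predictions": predict(model, rows)})


# Train once and save (the path is a temporary file, chosen for illustration).
model_path = Path(tempfile.mkdtemp()) / "model.pkl"
save_model(fit_mean_model([10.0, 20.0, 30.0]), model_path)

# Later, possibly on another machine: load and serve predictions.
served_model = load_model(model_path)
response = handle_request(served_model, '{"rows": [[4], [5]]}')
print(response)
```

The user only ever sees the request and the response; the Python stack, the training step, and the dataset stay on your side. Only unpickle files you created or trust, since loading a pickle can execute arbitrary code.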

We will use the tools below

For saving the model

For making the saved model conveniently accessible

Class demo

Course evaluations (~15 mins)

https://canvas.ubc.ca/courses/170662/external_tools/53187

  • They help us improve our teaching!
  • UBC & CS use them to provide rewards to instructors and TAs who are doing well!
  • UBC & CS use them to identify where instructors, TAs, and courses need additional support to improve.
  • UBC uses these in evaluating professors for tenure and promotion.
  • I’ll very much appreciate your constructive and concrete feedback.

What did we cover?

  • Part 1: Supervised learning on tabular data: ML fundamentals, preprocessing and data encoding, a bunch of models, evaluation metrics, feature importances and model transparency, feature selection, hyperparameter optimization

  • Part 2: Dealing with other non-tabular data types: Clustering, recommender systems, computer vision with pre-trained deep learning models (high level), language data, text preprocessing, embeddings, topic modeling, time series, right-censored data / survival analysis

  • Part 3: Communication, ethics, and deployment

What we didn’t cover

  • How these models work under the hood

What next?

If you want to further develop your machine learning skills:

  • Practice!

  • Work on your own projects. Make your work available and reproducible.

  • If you are interested in research in machine learning

    • Take CPSC 340. If you do not have the required prereqs you can try to audit it.
    • Get into the habit of reading papers and replicating results

❓❓ Questions for you

For each of the scenarios below:

  • Identify whether ML is a good solution for the problem. If yes:
    • Frame the problem as an ML problem.
    • Discuss what kind of features you would need to solve the problem effectively.
    • What would be a reasonable baseline?
    • Which model would be suitable for the given scenario?
    • What would be the appropriate success metrics?

QueuePredictor app with call-level data

    call_id  arrival_time         queue_len_at_arrival  agents_on_duty  is_vip  wait_time_sec  answered
 0        1  2025-01-01 09:01:43                     8              10       1            128         1
 1        2  2025-01-01 09:03:03                     1               5       0             15         1
 2        3  2025-01-01 09:04:09                     2               8       1             11         1
 3        4  2025-01-01 09:04:48                    13              10       1            171         1
 4        5  2025-01-01 09:05:31                     6              10       1             67         1
 5        6  2025-01-01 09:07:10                    10               7       0             95         0
 6        7  2025-01-01 09:08:25                    12               6       0            142         0
 7        8  2025-01-01 09:09:02                     3               9       0             32         1
 8        9  2025-01-01 09:10:15                     7               8       1             89         1
 9       10  2025-01-01 09:11:30                    15               5       0            210         0
10       11  2025-01-01 09:12:45                     4               9       0             43         1
11       12  2025-01-01 09:14:02                     9               7       1            115         1
12       13  2025-01-01 09:15:20                    11               6       0             78         0
13       14  2025-01-01 09:16:35                     5              10       0             51         1
14       15  2025-01-01 09:17:48                    14               5       0            188         0
15       16  2025-01-01 09:19:01                     2               8       1             22         1
16       17  2025-01-01 09:20:15                     8               9       0            102         1
17       18  2025-01-01 09:21:30                    13               6       0            165         0
18       19  2025-01-01 09:22:42                     6              10       1             73         1
19       20  2025-01-01 09:23:55                     3               9       0             38         1

Scenario: A call center wants to inform callers of their expected wait time when they join the queue.

Available data: Historical calls with both completed (answered) and abandoned (hung up) calls.
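
One possible starting point for the baseline question is sketched below. It assumes (this is not stated in the data description) that `wait_time_sec` for an abandoned call records the time until the caller hung up, so it is only a lower bound on the wait they would have had, i.e. right-censored. Under that assumption, the sketch uses answered calls only and always predicts their median wait. The rows are the 20 sample calls from the table above; the variable names are illustrative.

```python
from statistics import median

# (wait_time_sec, answered) for the 20 sample calls shown in the table.
calls = [
    (128, 1), (15, 1), (11, 1), (171, 1), (67, 1),
    (95, 0), (142, 0), (32, 1), (89, 1), (210, 0),
    (43, 1), (115, 1), (78, 0), (51, 1), (188, 0),
    (22, 1), (102, 1), (165, 0), (73, 1), (38, 1),
]

answered_waits = [wait for wait, answered in calls if answered == 1]
abandoned_waits = [wait for wait, answered in calls if answered == 0]

# Baseline: always predict the median wait of the answered calls.
baseline_prediction = median(answered_waits)

# Mean absolute error of the baseline on the answered calls themselves.
# This is in-sample only because the sample is tiny; with real data you
# would evaluate on a held-out (ideally later-in-time) split.
mae = sum(abs(w - baseline_prediction) for w in answered_waits) / len(answered_waits)

print(f"answered: {len(answered_waits)}, abandoned: {len(abandoned_waits)}")
print(f"median baseline: {baseline_prediction} sec, in-sample MAE: {mae:.1f} sec")
```

Dropping the abandoned calls is a simplification: if callers hang up mostly when waits are long, a model trained on answered calls alone will tend to underestimate wait times, which is where the survival analysis ideas from Part 2 become relevant.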

QueuePredictor app with interval data

   interval_start       calls_arrived  avg_wait_time_sec  callers_abandoned  avg_queue_len  avg_agents_on_duty
0  2025-01-01 09:00:00              4              81.25                  0           6.00                8.25
1  2025-01-01 09:05:00              7              54.29                  1           5.29                8.71
2  2025-01-01 09:10:00              3              37.00                  0           4.00                9.00
3  2025-01-01 09:15:00              4             147.00                  1          11.75                7.00
4  2025-01-01 09:20:00              3              99.67                  1           7.00                7.00

Scenario: Same problem, but data is aggregated by 5-minute intervals instead of individual calls.

Available data: Each row represents a time window with summary statistics.
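
To see what aggregation does to the data, here is a minimal sketch that groups call-level rows into 5-minute windows and computes the same summary columns. It uses the four sample calls that arrived between 09:00 and 09:05 in the call-level table; the helper name `interval_start` and the exact aggregation rules (plain means, abandoned = `answered == 0`) are assumptions about how the interval data might have been built.

```python
from collections import defaultdict
from datetime import datetime

# Sample call-level rows arriving before 09:05:
# (arrival_time, queue_len_at_arrival, agents_on_duty, wait_time_sec, answered)
calls = [
    ("2025-01-01 09:01:43", 8, 10, 128, 1),
    ("2025-01-01 09:03:03", 1, 5, 15, 1),
    ("2025-01-01 09:04:09", 2, 8, 11, 1),
    ("2025-01-01 09:04:48", 13, 10, 171, 1),
]


def interval_start(timestamp, minutes=5):
    """Floor a timestamp string to the start of its `minutes`-wide window."""
    t = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return t.replace(minute=t.minute - t.minute % minutes, second=0)


# Group calls by the window they arrived in.
groups = defaultdict(list)
for call in calls:
    groups[interval_start(call[0])].append(call)

# One summary row per window.
summary = {}
for start, rows in sorted(groups.items()):
    n = len(rows)
    summary[start] = {
        "calls_arrived": n,
        "avg_wait_time_sec": sum(r[3] for r in rows) / n,
        "callers_abandoned": sum(1 for r in rows if r[4] == 0),
        "avg_queue_len": sum(r[1] for r in rows) / n,
        "avg_agents_on_duty": sum(r[2] for r in rows) / n,
    }

for start, stats in summary.items():
    print(start, stats)
```

For these four calls the result agrees with the 09:00:00 row of the interval table. Note what is lost: per-call features such as `is_vip` and the queue length at the moment of arrival are no longer available, so the prediction target becomes an average over a window rather than an individual caller's wait.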

❓❓ More scenarios for practice

App                Goal
To-doList App      Keep track of the tasks that a user inputs and organize them by date
SegmentSphere App  Segment customers to tailor marketing strategies based on purchasing behavior
Video app          Recommend useful videos
Dining app         Identify cuisine by a restaurant’s menu
Weather app        Calculate precipitation in six-hour increments for a geographic region
EvoCarShare app    Calculate number of car rentals in four increments at a particular Evo parking spot
Pharma app         Understand the effect of a new drug on patient survival time

Conclusion & farewell

That’s all! We made it! I hope you learned something useful from the course. You all are wonderful students and I had fun teaching this course ♥️!

If you didn’t fill out the course evaluations during class, it would be great if you could fill them in when you get a chance.

Time permitting

  • Class picture 😊